feat(realtime): allow tools to opt out of automatic response.create after completion#3033
jawwad-ali wants to merge 1 commit into openai:main
Conversation
Realtime sessions hard-coded `start_response=True` when sending tool outputs back to the model, so every tool unconditionally triggered a follow-up `response.create`. Side-effect tools (analytics, background-job schedulers, telemetry) had no way to stay silent after completion.

This adds a `start_response: bool = True` kw-only field to `FunctionTool` and threads it through the `@function_tool(...)` decorator. The realtime session then honors the field when emitting `RealtimeModelSendToolOutput` for both successful tool execution and approval rejections; the handoff path still triggers `response.create` because the next agent must speak. The default stays `True`, so existing tools keep their current behavior.

Refs openai#2971
Thanks for the suggestion. I think having this flag on the `function_tool` decorator side may not be optimal in terms of SDK design. A universal option per realtime agent session might make sense, but I am not confident that such an option would actually help enough use cases. We don't plan to add this option for now, so let us close this PR.
@seratch, what do you think about not forcing response creation when a tool is executed? Do we have to force a response upon every tool execution? Another valid scenario is for a tool to only add its result to the bot context, without triggering a response. In realtime voice, one might want a tool to save a user's or the tool's data in the background without interrupting the audio stream. This is a widely used concept in modern frameworks:

- LangGraph treats agents as cyclical graphs (state machines). Tool execution is just a node that modifies a global "state"; it does not automatically trigger an LLM response unless you explicitly draw an edge to the LLM node.
- CrewAI is built around Tasks, with Tools given to accomplish those tasks. The agent executes tools internally in a loop without returning a conversational response to the user until the Task is marked complete.
- Haystack's ToolInvoker is a component wired by connecting outputs to inputs. If you don't wire the ToolInvoker output back into a Generator (LLM), it doesn't generate a response.

Forcing an LLM response after every tool execution is an anti-pattern for realtime and agentic workflows. Modern frameworks treat tool execution as state mutation (LangGraph, Rasa), pluggable functions (Semantic Kernel), background tasks (CrewAI), or explicit pipeline components (Haystack). Tying tool execution strictly to dialogue generation introduces unavoidable latency and prevents silent background operations, which are critical for smooth voice/realtime UX. Closing this PR without suggesting any alternative does not seem like the right call. The framework around a model is an important part of making it usable; with decisions like this, OpenAI only makes other models' jobs easier.
Summary
Realtime sessions hard-coded `start_response=True` when sending tool outputs back to the model, so every tool unconditionally triggered a follow-up `response.create`. Side-effect tools (analytics, background-job schedulers, telemetry) had no way to stay silent after completion, which is exactly the gap @aligokalppeker raised in #2971 (comment).
This PR exposes that toggle on `FunctionTool` and the `@function_tool(...)` decorator. The transport layer (`OpenAIRealtimeWebSocketModel._send_tool_output`) already honors `RealtimeModelSendToolOutput.start_response`; this change just lets tool authors reach it.
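To make the flow concrete, here is a minimal self-contained sketch of the wiring described above. The class and field names mirror the PR description, but the bodies are illustrative stand-ins, not the SDK's actual internals:

```python
# Illustrative model: the session copies the tool's start_response flag
# into the outgoing tool-output event instead of hard-coding True.
from dataclasses import dataclass


@dataclass
class FunctionTool:
    name: str
    start_response: bool = True  # the opt-out knob this PR adds


@dataclass
class RealtimeModelSendToolOutput:
    tool_name: str
    output: str
    start_response: bool  # transport layer already honors this field


def build_tool_output_event(tool: FunctionTool, output: str) -> RealtimeModelSendToolOutput:
    # Previously the equivalent of start_response=True was hard-coded here.
    return RealtimeModelSendToolOutput(
        tool_name=tool.name,
        output=output,
        start_response=tool.start_response,
    )


silent = FunctionTool(name="log_event", start_response=False)
chatty = FunctionTool(name="lookup_weather")  # default stays True

assert build_tool_output_event(silent, "ok").start_response is False
assert build_tool_output_event(chatty, "72F").start_response is True
```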
What changes
- `FunctionTool` gains a kw-only `start_response: bool = True` field, exposed through the `@function_tool(...)` decorator.
- The realtime session honors the field when emitting `RealtimeModelSendToolOutput` for tool results and approval rejections; handoffs still trigger `response.create`.
- Default remains `True`, so every existing tool keeps current behavior.
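As a usage sketch, a side-effect tool could opt out like this. The decorator below is a minimal stand-in modeled on the PR description so the example runs on its own; it is not the SDK's real implementation:

```python
# Minimal stand-in for the SDK's function_tool decorator, showing how
# the kw-only start_response flag would be threaded through.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FunctionTool:
    func: Callable[..., Any]
    start_response: bool = True


def function_tool(*, start_response: bool = True):
    def wrap(func: Callable[..., Any]) -> FunctionTool:
        return FunctionTool(func=func, start_response=start_response)
    return wrap


@function_tool(start_response=False)
def record_analytics(event: str) -> str:
    """Side-effect tool: completes silently, no automatic response.create."""
    return "logged"


assert record_analytics.start_response is False
assert record_analytics.func("session_started") == "logged"
```

Because the default is `True`, a tool decorated without the flag behaves exactly as before.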
Scope note (re: #2971's race report)
Per @seratch's response in the same thread, `v0.14.2` already routes `response.create` through `_ResponseCreateSequencer` and the active-response race is gated until a concrete repro exists. This PR does not touch the sequencer or add any tracking counters. It only addresses the per-tool ergonomics raised in @aligokalppeker's follow-up, which is independent of the race-detection path.
Test plan
Verified locally on Windows (Python 3.13):
New tests (6 total):
Issue number
Refs #2971